{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Lesson 3 Exercise 1: Three Queries Three Tables\n",
    "<img src=\"images/cassandralogo.png\" width=\"250\" height=\"250\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Walk through the basics of creating a table in Apache Cassandra, inserting rows of data, and doing a simple CQL query to validate the information. You will practice Denormalization, and the concept of 1 table per query, which is an encouraged practice with Apache Cassandra. \n",
    "\n",
    "### Remember, replace ##### with your answer.\n",
    "\n",
    "\n",
    "Note: __Do not__ click the blue Preview button at the bottom"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### We will use a python wrapper/ python driver called cassandra to run the Apache Cassandra queries. This library should be preinstalled but in the future to install this library you can run this command in a notebook to install locally: \n",
    "! pip install cassandra-driver\n",
    "#### More documentation can be found here:  https://datastax.github.io/python-driver/"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Import Apache Cassandra python package"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import cassandra"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Create a connection to the database"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from cassandra.cluster import Cluster\n",
    "try: \n",
    "    cluster = Cluster(['127.0.0.1']) #If you have a locally installed Apache Cassandra instance\n",
    "    session = cluster.connect()\n",
    "except Exception as e:\n",
    "    print(e)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Create a keyspace to work in"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "try:\n",
    "    session.execute(\"\"\"\n",
    "    CREATE KEYSPACE IF NOT EXISTS udacity \n",
    "    WITH REPLICATION = \n",
    "    { 'class' : 'SimpleStrategy', 'replication_factor' : 1 }\"\"\"\n",
    ")\n",
    "\n",
    "except Exception as e:\n",
    "    print(e)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Connect to our Keyspace. Compare this to how we had to create a new session in PostgreSQL.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "try:\n",
    "    session.set_keyspace('udacity')\n",
    "except Exception as e:\n",
    "    print(e)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Let's imagine we would like to start creating a Music Library of albums. \n",
    "\n",
    "### We want to ask 3 questions of the data\n",
    "#### 1. Give every album in the music library that was released in a given year\n",
    "`select * from music_library WHERE YEAR=1970`\n",
    "#### 2. Give every album in the music library that was created by a given artist  \n",
    "`select * from artist_library WHERE artist_name=\"The Beatles\"`\n",
    "#### 3. Give all the information from the music library about a given album\n",
    "`select * from album_library WHERE album_name=\"Close To You\"`\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Because we want to do three different queries, we will need different tables that partition the data differently. \n",
    "<img src=\"images/table1.png\" width=\"350\" height=\"350\">\n",
    "<img src=\"images/table2.png\" width=\"350\" height=\"350\">\n",
    "<img src=\"images/table0.png\" width=\"550\" height=\"550\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### TO-DO: Create the tables. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "query = \"CREATE TABLE IF NOT EXISTS music_library\"\n",
    "query = query + \"(year int, artist_name text, album_name text, PRIMARY KEY(year, album_name))\"\n",
    "try:\n",
    "    session.execute(query)\n",
    "except Exception as e:\n",
    "    print(e)\n",
    "    "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "query1 = \"CREATE TABLE IF NOT EXISTS artist_library\"\n",
    "query1 = query1 + \"(artist_name text, year int, album_name text, PRIMARY KEY(artist_name,year))\"\n",
    "try:\n",
    "    session.execute(query1)\n",
    "except Exception as e:\n",
    "    print(e)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "query2 = \"CREATE TABLE IF NOT EXISTS album_library\"\n",
    "query2 = query2 + \"(album_name text, artist_name text, year int, PRIMARY KEY(album_name, year))\"\n",
    "try:\n",
    "    session.execute(query2)\n",
    "except Exception as e:\n",
    "    print(e)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### TO-DO: Insert data into the tables"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "query = \"INSERT INTO music_library (year, artist_name, album_name)\"\n",
    "query = query + \" VALUES (%s, %s, %s)\"\n",
    "\n",
    "query1 = \"INSERT INTO artist_library (artist_name, year, album_name)\"\n",
    "query1 = query1 + \" VALUES (%s, %s, %s)\"\n",
    "\n",
    "query2 = \"INSERT INTO album_library (album_name, artist_name, year)\"\n",
    "query2 = query2 + \" VALUES (%s, %s, %s)\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "try:\n",
    "    session.execute(query, (1970, \"The Beatles\", \"Let it Be\"))\n",
    "except Exception as e:\n",
    "    print(e)\n",
    "    \n",
    "try:\n",
    "    session.execute(query, (1965, \"The Beatles\", \"Rubber Soul\"))\n",
    "except Exception as e:\n",
    "    print(e)\n",
    "    \n",
    "try:\n",
    "    session.execute(query, (1965, \"The Who\", \"My Generation\"))\n",
    "except Exception as e:\n",
    "    print(e)\n",
    "\n",
    "try:\n",
    "    session.execute(query, (1966, \"The Monkees\", \"The Monkees\"))\n",
    "except Exception as e:\n",
    "    print(e)\n",
    "\n",
    "try:\n",
    "    session.execute(query, (1970, \"The Carpenters\", \"Close To You\"))\n",
    "except Exception as e:\n",
    "    print(e)\n",
    " "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "   \n",
    "try:\n",
    "    session.execute(query1, (\"The Beatles\", 1970, \"Let it Be\"))\n",
    "except Exception as e:\n",
    "    print(e)\n",
    "    \n",
    "try:\n",
    "    session.execute(query1, (\"The Beatles\", 1965, \"Rubber Soul\"))\n",
    "except Exception as e:\n",
    "    print(e)\n",
    "    \n",
    "try:\n",
    "    session.execute(query1, (\"The Who\", 1965, \"My Generation\"))\n",
    "except Exception as e:\n",
    "    print(e)\n",
    "\n",
    "try:\n",
    "    session.execute(query1, (\"The Monkees\", 1966, \"The Monkees\"))\n",
    "except Exception as e:\n",
    "    print(e)\n",
    "\n",
    "try:\n",
    "    session.execute(query1, (\"The Carpenters\", 1970, \"Close To You\"))\n",
    "except Exception as e:\n",
    "    print(e)\n",
    "    "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "   \n",
    "try:\n",
    "    session.execute(query1, (\"The Beatles\", 1970, \"Let it Be\"))\n",
    "except Exception as e:\n",
    "    print(e)\n",
    "    \n",
    "try:\n",
    "    session.execute(query1, (\"The Beatles\", 1965, \"Rubber Soul\"))\n",
    "except Exception as e:\n",
    "    print(e)\n",
    "    \n",
    "try:\n",
    "    session.execute(query1, (\"The Who\", 1965, \"My Generation\"))\n",
    "except Exception as e:\n",
    "    print(e)\n",
    "\n",
    "try:\n",
    "    session.execute(query1, (\"The Monkees\", 1966, \"The Monkees\"))\n",
    "except Exception as e:\n",
    "    print(e)\n",
    "\n",
    "try:\n",
    "    session.execute(query1, (\"The Carpenters\", 1970, \"Close To You\"))\n",
    "except Exception as e:\n",
    "    print(e)\n",
    "    "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "try:\n",
    "    session.execute(query2, (\"Let it Be\", \"The Beatles\", 1970))\n",
    "except Exception as e:\n",
    "    print(e)\n",
    "    \n",
    "try:\n",
    "    session.execute(query2, (\"Rubber Soul\", \"The Beatles\", 1965))\n",
    "except Exception as e:\n",
    "    print(e)\n",
    "    \n",
    "try:\n",
    "    session.execute(query2, (\"My Generation\", \"The Who\", 1965))\n",
    "except Exception as e:\n",
    "    print(e)\n",
    "\n",
    "try:\n",
    "    session.execute(query2, (\"The Monkees\", \"The Monkees\", 1966))\n",
    "except Exception as e:\n",
    "    print(e)\n",
    "\n",
    "try:\n",
    "    session.execute(query2, (\"Close To You\", \"The Carpenters\", 1970))\n",
    "except Exception as e:\n",
    "    print(e)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This might have felt unnatural to insert duplicate data into the tables. If I just normalized these tables, I wouldn't have to have extra copies! While this is true, remember there are no `JOINS` in Apache Cassandra. For the benefit of high availibity and scalabity, denormalization must be how this is done. \n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### TO-DO: Validate the Data Model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1970 The Carpenters Close To You\n",
      "1970 The Beatles Let it Be\n"
     ]
    }
   ],
   "source": [
    "query = \"select * from music_library WHERE year=1970\"\n",
    "try:\n",
    "    rows = session.execute(query)\n",
    "except Exception as e:\n",
    "    print(e)\n",
    "    \n",
    "for row in rows:\n",
    "    print (row.year, row.artist_name, row.album_name,)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Your output should be:\n",
    "1970 The Beatles Let it Be<br>\n",
    "1970 The Carpenters Close To You"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### TO-DO: Validate the Data Model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The Beatles 1965 Rubber Soul\n",
      "The Beatles 1970 Let it Be\n"
     ]
    }
   ],
   "source": [
    "query =\"select * from artist_library WHERE artist_name='The Beatles'\"\n",
    "try:\n",
    "    rows = session.execute(query)\n",
    "except Exception as e:\n",
    "    print(e)\n",
    "    \n",
    "for row in rows:\n",
    "    print (row.artist_name, row.year, row.album_name)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Your output should be:\n",
    "The Beatles 1965 Rubber Soul<br>\n",
    "The Beatles 1970 Let it Be"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### TO-DO: Validate the Data Model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The Carpenters 1970 Close To You\n"
     ]
    }
   ],
   "source": [
    "query = \"select * from album_library WHERE album_name='Close To You'\"\n",
    "try:\n",
    "    rows = session.execute(query)\n",
    "except Exception as e:\n",
    "    print(e)\n",
    "    \n",
    "for row in rows:\n",
    "    print (row.artist_name, row.year, row.album_name)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Your output should be:\n",
    "The Carpenters 1970 Close To You"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### And finally close the session and cluster connection"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "query = \"DROP TABLE music_library\"\n",
    "try:\n",
    "    rows = session.execute(query)\n",
    "except Exception as e:\n",
    "    print(e)\n",
    "query1 = \"DROP TABLE artist_library\"\n",
    "try:\n",
    "    rows = session.execute(query1)\n",
    "except Exception as e:\n",
    "    print(e)\n",
    "query2 = \"DROP TABLE album_library\"\n",
    "try:\n",
    "    rows = session.execute(query2)\n",
    "except Exception as e:\n",
    "    print(e)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "session.shutdown()\n",
    "cluster.shutdown()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
